PostGIS and Spatial Data Types#

Dataset used is restaurants dataset from PostGIS in Action textbook.

from sqlalchemy import create_engine
import pandas as pd

# experimenting with holoviews
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
engine_str = "postgresql+psycopg2://docker:docker@0.0.0.0:25432/restaurants"
engine = create_engine(engine_str)
%load_ext sql
%sql $engine.url
'Connected: docker@restaurants'

Import the restaurants dataset csv to postgres

%%sql
DROP TABLE IF EXISTS restaurants CASCADE;

CREATE TABLE restaurants (
    id SERIAL,
    name VARCHAR(3),
    latitude REAL,
    longitude REAL,
    PRIMARY KEY (id)
);
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
Done.
Done.
[]

CSV is mounted in in postgres docker container.

%%sql
COPY restaurants(name, latitude, longitude)
FROM '/opt/data/restaurants.csv'
DELIMITER ',';
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
50002 rows affected.
[]

PostGIS extensions are available in the docker image I am using. We need to enable the base PostGIS extension for this database.

%%sql
SELECT * 
FROM pg_available_extensions
WHERE comment SIMILAR TO '%%PostGIS%%';
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
11 rows affected.
name default_version installed_version comment
postgis_topology-3 3.2.1 None PostGIS topology spatial types and functions
postgis_sfcgal 3.2.1 None PostGIS SFCGAL functions
postgis_tiger_geocoder-3 3.2.1 None PostGIS tiger geocoder and reverse geocoder
postgis_tiger_geocoder 3.2.1 None PostGIS tiger geocoder and reverse geocoder
postgis-3 3.2.1 None PostGIS geometry and geography spatial types and functions
postgis_sfcgal-3 3.2.1 None PostGIS SFCGAL functions
postgis_topology 3.2.1 None PostGIS topology spatial types and functions
postgis_raster 3.2.1 None PostGIS raster types and functions
postgis_raster-3 3.2.1 None PostGIS raster types and functions
postgis 3.2.1 3.2.1 PostGIS geometry and geography spatial types and functions
pointcloud_postgis 1.2.1 None integration for pointcloud LIDAR data and PostGIS geometry data
%sql CREATE EXTENSION IF NOT EXISTS postgis;
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
Done.
[]

Using PostGIS geometry and geography types to store the coordinate information. The geography type has special methods available specifically for spherical coordinate systems.

%%sql
    ALTER TABLE restaurants
    ADD COLUMN IF NOT EXISTS geom geometry,
    ADD COLUMN IF NOT EXISTS geog geography;

    UPDATE restaurants
    SET geom = ST_Point(longitude, latitude, 4326);

    UPDATE restaurants
    SET geog = geom::geography;
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
Done.
50002 rows affected.
50002 rows affected.
[]

Lets check out our data and confirm the SRID is correct.

%sql SELECT *, ST_SRID(geog) AS geog_srid, ST_SRID(geom) as geom_srid FROM restaurants LIMIT 10;
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
10 rows affected.
id name latitude longitude geom geog geog_srid geom_srid
1 BKG 25.8092 -80.24 0101000020E6100000000000205C0F54C0000000C027CF3940 0101000020E6100000000000205C0F54C0000000C027CF3940 4326 4326
2 BKG 25.8587 -80.2094 0101000020E6100000000000C0660D54C0000000C0D3DB3940 0101000020E6100000000000C0660D54C0000000C0D3DB3940 4326 4326
3 BKG 25.8471 -80.2415 0101000020E6100000000000C0740F54C000000080DBD83940 0101000020E6100000000000C0740F54C000000080DBD83940 4326 4326
4 BKG 25.8306 -80.2661 0101000020E6100000000000C0071154C000000040A2D43940 0101000020E6100000000000C0071154C000000040A2D43940 4326 4326
5 BKG 25.9377 -80.2449 0101000020E610000000000080AC0F54C0000000200DF03940 0101000020E610000000000080AC0F54C0000000200DF03940 4326 4326
6 BKG 29.187 -82.1061 0101000020E610000000000060CA8654C000000040DF2F3D40 0101000020E610000000000060CA8654C000000040DF2F3D40 4326 4326
7 BKG 27.9256 -82.7874 0101000020E6100000000000C064B254C000000020F4EC3B40 0101000020E6100000000000C064B254C000000020F4EC3B40 4326 4326
8 BKG 41.8426 -87.7052 0101000020E61000000000000022ED55C000000060DAEB4440 0101000020E61000000000000022ED55C000000060DAEB4440 4326 4326
9 BKG 34.0096 -85.9876 0101000020E6100000000000E0347F55C0000000A03A014140 0101000020E6100000000000E0347F55C0000000A03A014140 4326 4326
10 BKG 32.9516 -117.062 0101000020E6100000000000C0F7435DC000000000CE794040 0101000020E6100000000000C0F7435DC000000000CE794040 4326 4326

Let’s visualize the points. I’d like to get familiar with the holoviews library, lets try it out.

with engine.connect() as con:
    restaurants = pd.read_sql('SELECT ST_X(geom) as lon, ST_Y(geom) as lat, name FROM restaurants;', con)
vdims = [('name', 'Restaurant')]
ds = hv.Dataset(restaurants, ['lon', 'lat'], vdims)
points = hv.Points(ds)
points.opts(opts.Points(color='name', width=600, height=400, cmap='tab20', legend_position='top_left'))

How can I find the closest restaurants to Hunter College? as we are using the Geography type, we can find the minimum geodesic distance in meters.

%%sql

-- Hunter College coordinate retrieved from google maps
SELECT name, ST_Distance(ST_Point(-73.9644861858826, 40.768010416903955)::geography, geog) as distance_from_hunter 
FROM restaurants
ORDER BY distance_from_hunter ASC
LIMIT 10
;
 * postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
10 rows affected.
name distance_from_hunter
KFC 579.59174361
MCD 662.42972091
MCD 822.3617656
WDY 899.24830005
MCD 948.3807947
MCD 1038.67077814
MCD 1047.4709654
MCD 1322.93878124
MCD 1342.24631002
TCB 1418.26061805